We investigate the recently proposed label-propagation algorithm (LPA) for identifying network
communities. We reformulate the LPA as an equivalent optimization problem, giving an objective
function whose maxima correspond to community solutions. By considering properties of the
objective function, we identify conceptual and practical drawbacks of the label propagation approach,
most importantly the disparity between increasing the value of the objective function and
improving the quality of communities found. To address the drawbacks, we modify the objective function
in the optimization problem, producing a variety of algorithms that propagate labels subject to
constraints; of particular interest is a variant that maximizes the modularity measure of
community quality. Performance properties and implementation details of the proposed algorithms are
discussed. Bipartite as well as unipartite networks are considered.

There is great current interest in identifying communities in networks. Informally, communities in networks, or
graphs, are subgraphs whose vertices are more strongly connected to one another than to the vertices outside the
subgraph. A variety of approaches have been taken to make concrete the idea of communities, giving rise to a number
of efficient methods for community identification (for useful overviews, see Refs. [1, 2, 3]).

Recently, Raghavan et al. [4] have introduced a label-propagation algorithm (LPA) for identifying network
communities. Initially, each vertex in the graph is assigned a unique numeric label. The label for each vertex is replaced
with the most frequent label from its neighbors. Relabeling continues until a stable set of labels is reached. Network
communities are defined as the sets of vertices bearing the same labels. The LPA offers a number of desirable qualities,
including conceptual simplicity, ease of implementation, and practical efficiency: the algorithm rapidly finds
community assignments of high quality, as measured by the popular modularity measure [6].

The LPA was originally presented operationally, with communities defined as the outcome of a specific procedure. 
In this work, we consider an equivalent mathematical formulation, in which community solutions are understood in 
terms of optima of an objective function. We define an objective function H based on the number of edges that 
connect vertices with identical labels, and show that the LPA identifies local optima of H . This is formally equivalent 
to minimizing the Hamiltonian for a ferromagnetic Potts model [7]. The mathematical formulation exposes a number
of interesting properties of the LPA. A feature of conspicuous importance is that the globally optimal solution for 
any network is the uninteresting trivial solution in which all vertices are assigned the same label, with other solutions 
found by label propagation corresponding to suboptimal local maxima of H. 

The objective function optimized by label propagation thus corresponds poorly to our conceptual understanding of 
communities — an increase in H need not produce what we would consider to be better communities. In particular, 
attempts to improve on the label propagation algorithm by facilitating its escape from local maxima in H may be 
counterproductive. We demonstrate that this can create practical difficulties for improvement upon the standard 
LPA. 

We next consider adding a term to the original objective function that penalizes undesirable solutions, producing 
algorithms that propagate labels subject to constraints. We examine several possibilities for the penalty term. Of 
special interest is a penalty term that works to divide vertices into groups of equal total degree, yielding a label
propagation variant that strictly maximizes the modularity Q while maintaining the favorable computational complexity
of the standard LPA. We characterize the effectiveness of the several label propagation algorithms through application 
to a model network and a selection of real-world networks. 

The structure of the remainder of the paper is as follows. In section II, we briefly summarize the original operational
presentation of the label propagation algorithm. In section III, we reformulate the label propagation algorithm as a
mathematical optimization problem, and in section IV consider drawbacks of the LPA thus revealed. We address the
drawbacks in section V by adding constraints to the optimization problem, with attendant notes on implementation
in the appendices. The performance of several label propagation variants is compared in section VI for both unipartite
and bipartite networks. We conclude with a summary and discussion in section VII.



II. THE LABEL PROPAGATION ALGORITHM 



The identification of communities in networks is a topic of great recent interest. Formulation of the problem 
presents two main challenges. First, the notion of community is imprecise, requiring a definition to be provided 
for what constitutes a community. Second, community solutions must also be practically realizable for networks of 
interest. The interplay between these challenges allows a variety of community definitions and community identification 
algorithms suited to networks of different sizes, as measured by the number of vertices n or edges m in the network. 

A prominent formulation of the community-identification problem is based on the modularity Q introduced by
Newman and Girvan [6]. The quality of communities given by a partition of the network vertices is assessed by
comparing the number of edges between vertices in the same community to the number expected from a null model
network. Formally, this is

$$ Q = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - P_{ij} \right) \delta(g_i, g_j) , \tag{1} $$

where the $A_{ij}$ are components of the adjacency matrix for the network, and $g_i$ is the community for vertex i. The
presence of the Kronecker delta term $\delta(g_i, g_j)$ restricts the sum to edges within communities. The probability of an
edge existing between vertices i and j in the null model network is given by $P_{ij}$. The standard choice of null model
takes the probability of an edge to be proportional to the product of the degrees $k_i$ and $k_j$ of the vertices, giving

$$ P_{ij} = \frac{k_i k_j}{2m} . \tag{2} $$

With this choice for $P_{ij}$, the modularity becomes

$$ Q = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta(g_i, g_j) . \tag{3} $$

Communities are then sought by finding partitions of the set of vertices that have a high value for modularity. The 
global maximum of Q is generally inaccessible, as the number of possible partitions for a set grows too rapidly to 
be feasibly examined for all but the smallest networks, although effective heuristics exist for finding high modularity 
solutions. A seminal example is the greedy agglomerative hierarchical algorithm [8, 9], wherein pairs of communities
are successively merged so as to cause the largest possible increase in Q at each step. 
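
To make the modularity concrete, the following minimal Python sketch evaluates Q for an unweighted, undirected network under the standard degree-product null model. It uses the equivalent per-community form: the fraction of edges falling within communities, minus the sum over communities of (K/2m)^2, where K is the total degree of a community. The function name `modularity` and the edge-list and label-list representation are assumptions made for illustration, not part of the original work.

```python
from collections import defaultdict


def modularity(edges, labels):
    """Modularity Q of a partition, using the degree-product null model.

    edges  -- list of (u, v) pairs for an undirected, unweighted simple graph
    labels -- labels[v] is the community of vertex v
    """
    m = len(edges)
    within = 0                        # edges joining vertices with equal labels
    total_degree = defaultdict(int)   # summed degree of each community
    for u, v in edges:
        if labels[u] == labels[v]:
            within += 1
        total_degree[labels[u]] += 1
        total_degree[labels[v]] += 1
    # Expected fraction of within-community edges in the null model.
    expected = sum(K * K for K in total_degree.values()) / (4.0 * m * m)
    return within / m - expected
```

For two triangles joined by a single edge, labeling each triangle as its own community gives Q = 6/7 - 1/2, while placing all six vertices in one community gives Q = 0.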

Recently, Raghavan et al. [4] have introduced a label-propagation algorithm (LPA) for identifying network
communities. In contrast to the above modularity-based approach, communities are defined in the LPA as vertex partitions
identified by a specific algorithm. The algorithm is conceptually simple in its operation. Initially, each vertex in the 
graph is assigned a unique numeric label. The label for each vertex is then replaced with the most frequent label 
amongst its neighbors; when several labels are equally frequent, the current label is kept if it is among the most 
frequent, while otherwise a new label is chosen at random from the most frequent. Vertices are repeatedly relabeled, 
with the algorithm terminating when the label for each vertex is (one of) the most frequent of the labels for the 
neighbors of the vertex. To avoid possible cycles and ensure termination, Raghavan et al. [4] suggest updating the
vertex labels asynchronously and in random order. Network communities are then associated with sets of vertices 
bearing the same labels. 
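
The procedure described above can be sketched in a few lines of Python. This is an illustrative reading of the operational description, not the reference implementation: the function name `label_propagation`, the dictionary-based graph representation, and the `max_sweeps` safeguard are all assumptions introduced here.

```python
import random
from collections import Counter


def label_propagation(adjacency, rng=None, max_sweeps=1000):
    """Asynchronous label propagation on an undirected graph.

    adjacency -- dict mapping each vertex to a list of its neighbors
    Returns a dict mapping each vertex to its final label.
    """
    rng = rng or random.Random()
    labels = {v: v for v in adjacency}           # unique initial labels
    vertices = list(adjacency)
    for _ in range(max_sweeps):                  # assumed safeguard, not in the text
        rng.shuffle(vertices)                    # random update order
        changed = False
        for v in vertices:
            counts = Counter(labels[u] for u in adjacency[v])
            if not counts:
                continue                         # isolated vertex keeps its label
            best = max(counts.values())
            candidates = [l for l, c in counts.items() if c == best]
            if labels[v] not in candidates:      # keep the current label on ties
                labels[v] = rng.choice(candidates)
                changed = True
        if not changed:                          # every label is a most frequent one
            break
    return labels
```

Each sweep visits every edge a constant number of times, which is the source of the per-iteration cost linear in the number of edges.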

The LPA offers a number of desirable qualities. As described above, it is conceptually simple, being readily 
understood and quickly implemented. Communities found can be of high quality, as assessed, e.g., by the modularity. 
The algorithm is efficient in practice. Each relabeling iteration through the vertices has a computational complexity 
linear in the number of edges in the graph. The total number of iterations is not a priori clear, but relatively 
few iterations are needed to assign the final label to most of the vertices (over 95% of vertices in 5 iterations, see 
Refs. 0,1). 

Two related works are of particular note. First, Tibély and Kertész have identified the label propagation
algorithm as formally equivalent to minimizing the Hamiltonian for a kinetic Potts model, and used this to argue
that, at least in some networks, the identified communities may be meaningless. Additionally, through empirical
investigation of two real-world networks, Tibély and Kertész have shown that the number of distinct community
solutions may be very large, much larger than the number of network vertices. Taken together, these observations
highlight the need for further assessment of the quality of communities found using label propagation.

Second, Leung et al. have examined the LPA as a basis for analyzing large networks, focusing on performance
characteristics and limitations. They suggest a number of extensions and optimizations, resulting in a modified 
algorithm that is able to find communities in a network with tens of millions of edges in a few minutes using a desktop 
PC. This study thus suggests that label propagation has tremendous potential as an effective and efficient method 
for community identification. 



III. AN OBJECTIVE FUNCTION FOR LABEL PROPAGATION 



Thus far, the LPA has been presented operationally — the community solutions are defined as the outcome of a 
specific procedure. Alternatively, an equivalent mathematical formulation, first recognized by Tibély and Kertész,
can be given, where community solutions are understood in terms of the results of applying an optimization procedure 
to an objective function. The optimization procedure is the LPA, while the objective function remains to be specified. 
The mathematical reformulation thus requires defining the objective function, which provides an alternate means of 
understanding solutions found by the LPA. 

To effect this reformulation, we first express the LPA optimization procedure as 

$$ l'_v = \operatorname*{argmax}_{l} \sum_{u \in \sigma(v)} \delta(l_u, l) , \tag{4} $$

where $l_u$ is the current label for vertex u, $l'_v$ is the new label for vertex v, $\sigma(v)$ is the set of vertices neighboring v
in the network, and $\delta$ is the Kronecker delta. In the event that multiple values would maximize the sum, the result
of argmax should be taken as for the procedural description of the LPA, i.e., keep the current label if it would satisfy
Eq. (4), otherwise take a label at random that satisfies Eq. (4).

Equation (4) can be written in terms of the adjacency matrix A for the network, giving

$$ l'_v = \operatorname*{argmax}_{l} \sum_{u=1}^{n} A_{uv} \delta(l_u, l) , \tag{5} $$

where n is the number of vertices in the network. Consistent with the LPA, the adjacency matrix elements $A_{uv}$ are
all elements of {0, 1}. However, the discrete nature of the $A_{uv}$ is never made use of, so the form in Eq. (5) is equally
applicable to weighted networks.

Next, we introduce an objective function H that is maximized by the optimization procedure. Intuitively, we can 
view the LPA as working to assign labels so as to increase the number of edges that connect vertices with identical 
labels. Formally, this number has the expression

$$ H = \frac{1}{2} \sum_{v=1}^{n} \sum_{u \in \sigma(v)} \delta(l_u, l_v) . \tag{6} $$

Equation (6) can be rewritten in terms of the network adjacency matrix, giving

$$ H = \frac{1}{2} \sum_{v=1}^{n} \sum_{u=1}^{n} A_{uv} \delta(l_u, l_v) . \tag{7} $$

We note that maximizing H is equivalent to minimizing the Hamiltonian for a ferromagnetic Potts model; this
connection has been previously recognized by Tibély and Kertész. The use of a Potts model Hamiltonian in
network partitioning has been explored in depth by Reichardt and Bornholdt [10].

It remains to be verified that the optimization rule in Eq. (4) does in fact maximize the objective function in Eq. (7).
Consider updating the label for some vertex x. We rewrite Eq. (7) to treat vertex x separately, yielding

$$ H = \frac{1}{2} \left( \sum_{v \neq x} \sum_{u \neq x} A_{uv} \delta(l_u, l_v) + \sum_{u=1}^{n} A_{ux} \delta(l_u, l_x) + \sum_{v=1}^{n} A_{xv} \delta(l_x, l_v) - A_{xx} \right) . \tag{8} $$

Taking advantage of the symmetry of the adjacency matrix, we can simplify Eq. (8), giving

$$ H = \frac{1}{2} \left( \sum_{v \neq x} \sum_{u \neq x} A_{uv} \delta(l_u, l_v) - A_{xx} \right) + \sum_{u=1}^{n} A_{ux} \delta(l_u, l_x) . \tag{9} $$

The final term on the right hand side of Eq. (9) is exactly of the form maximized by the LPA optimization rule as
expressed in Eq. (5), while the other terms are independent of the label on vertex x. Thus, the objective function
never decreases under the action of the LPA, ultimately reaching a local maximum or limit cycle.

An important property of the label propagation algorithm is immediately apparent from the form of H. For any 
network, the LPA allows an uninteresting trivial solution in which all vertices are assigned the same label. From
H, we see that the trivial solution is in fact the globally optimal solution. Other solutions found by label propagation 
correspond to local maxima of H. 
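
The dominance of the trivial solution is easy to check numerically. The short Python sketch below counts the edges whose endpoints share a label, which is the quantity H for an unweighted network; the function name `objective_h` and the small example graph (two triangles joined by one edge) are assumptions chosen for illustration.

```python
def objective_h(edges, labels):
    """Number of edges whose endpoints carry identical labels (the objective H)."""
    return sum(1 for u, v in edges if labels[u] == labels[v])


# Two triangles joined by a single bridging edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(objective_h(edges, [0, 0, 0, 1, 1, 1]))  # 6: the intuitive two-community split
print(objective_h(edges, [0, 0, 0, 0, 0, 0]))  # 7: the trivial solution, H = m
```

The intuitively correct division scores lower than the single-community labeling, which attains the largest possible value, the total number of edges.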

As the LPA optimization procedure in Eq. (5) produces only local changes, the search for maxima in H is prone
to becoming trapped at a local optimum instead of the global optimum. While normally a drawback of local search 
algorithms, this characteristic is essential to the function of the LPA: the trivial optimal solution is avoided by the 
dynamics of the local search algorithm, rather than through formal exclusion. 



IV. DRAWBACKS OF LABEL PROPAGATION 



The label propagation algorithm as a search scheme thus depends on a certain degree of ineffectuality. A typical 
way to attempt improvement of a local search algorithm is to make it more able to escape from local maxima in 
H. Such improvements to the LPA may be quite counterproductive, as better solutions in terms of H — notably, the 
global maximum — may be quite useless in practical terms. Despite this, label propagation in practice can produce 
communities that are of high quality in terms of, e.g., modularity: the local maxima are frustrated equilibria, with 
localized groups of well-connected vertices having the same label and with comparatively few edges between the 
groups. 

Generally, there is a poor correspondence between H and our conceptual understanding of communities. Maximizing
H, be it by label propagation or another approach, need not produce better communities. Regardless, the LPA
works by maximizing H, raising the question of whether, and in what sense, we are improving community quality.
Operationally, it is again unclear what it might mean to try improving the LPA. Does improving the search efficacy
actually give better communities? How do we prevent our optimizations from reaching the global maximum of H, or
other uninteresting solutions with high values of H?

To illustrate the difficulties involved, we consider a possible optimization of the label propagation algorithm. When 
a vertex label is to be updated, it is necessary to handle the case where multiple labels are equally frequent for the 
neighboring vertices. In the standard LPA, these ties are broken by keeping the current label for the vertex, if it 
is one of the most frequent, or otherwise by selecting a label at random from the most frequent. In our optimized 
version, we will always select a label at random from the most frequent; in light of this additional randomization, we 
denote the modified algorithm as LPAr. The tie-breaking rule for the standard LPA corresponds to halting when a 
plateau in the H space is reached, while LPAr corresponds to allowing a random walk on the plateau in search of 
better solutions. 
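
The difference between the two tie-breaking rules is confined to a single decision, sketched below in Python. The function name `choose_label`, its arguments, and the `always_random` flag are hypothetical names introduced here for illustration.

```python
import random


def choose_label(current, counts, rng, always_random=False):
    """Pick a new label from neighbor label counts.

    counts        -- dict mapping label -> number (or weight) of neighbors
    always_random -- False: standard LPA rule (keep the current label on a tie)
                     True:  LPAr rule (always draw at random among the best)
    """
    best = max(counts.values())
    candidates = sorted(l for l, c in counts.items() if c == best)
    if not always_random and current in candidates:
        return current                 # halt on the plateau
    return rng.choice(candidates)      # random step among equally good labels
```

With `always_random=False`, a vertex whose label is already among the most frequent never changes, so the search stops on a plateau of H; with `always_random=True`, the vertex may move between equally good labels, which allows the random walk described above.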

In Fig. 1, we show the number of communities found for one thousand applications of the standard LPA and the
putatively optimized LPAr to networks derived from the Southern women data. The data were collected by Davis
et al. [11] as part of an extensive study of class and race in the Deep South. The network represents interactions of
a group of 18 women at 14 various events in and around Natchez, Mississippi during the 1930s. This much-studied
network is typically found to have two communities using methods of social network analysis [12], in accord with the
conclusions from the original ethnographic study. Unfortunately, our attempted optimization has a perverse result 
with the Southern women network: the principal effect of the optimization is to drastically increase the frequency 
at which the algorithm assigns the same label to all vertices, failing to capture any aspect of the known community 
structure. 

At least in the Southern women network, several practical drawbacks arise from the key conceptual drawback 
discussed above. Optimization is made difficult, as seen in this case based on comparison with a known community 
structure. Further, the objective function optimized by LPA provides no mechanism for testing the quality of the 
resulting community solutions — we must instead assess quality through auxiliary considerations such as the number 
of communities or, e.g., the modularity Q of the community solution. 

FIG. 1: An attempted optimization of the label propagation algorithm produces dubious gains for the Southern women network.
The modified LPAr frequently produces the trivial solution, with all vertices assigned to the same community. In the network 
considered, we expect at least two communities based on the ethnographic study from which the data is drawn. 

V. CONSTRAINED LABEL PROPAGATION 

A well-established approach for eliminating undesirable solutions is to modify the objective function by adding a
constraint term that penalizes the undesirable solutions. Denoting the modified objective function as $H'$ and the
penalty term as G, we have

$$ H' = H - \lambda G , \tag{10} $$

where $\lambda$ is a parameter that weights G against the original objective function H. Numerous choices are possible for
G; we consider three possibilities below.

Within the specific area of community identification, the approach has been used at least since the landmark
paper by Fu and Anderson [13] applying methods of statistical mechanics to combinatorial optimization problems,
including graph bipartitioning. We base a first penalty term $G_1$ on their classic work. We seek to divide the vertices
into groups of the same size. In terms of the labels, we define

$$ G_1 = \frac{1}{2} \sum_{l=1}^{n} \left( \sum_{v=1}^{n} \delta(l_v, l) \right)^2 = \frac{1}{2} \sum_{u=1}^{n} \sum_{v=1}^{n} \delta(l_u, l_v) . \tag{11} $$

The penalty term $G_1$ produces the smallest value when all vertices have unique labels, and the largest value when all
vertices have the same label. Thus, the trivial global optimum of H is penalized and hopefully avoided.

Alternatively, following a strategy that mirrors contemporary methods for community detection, we can try to
divide the vertices into groups which have a similar total degree. We define a second penalty term $G_2$ to capture this
idea. The total degree $K_l$ of the vertices with a given label l is

$$ K_l = \sum_{i=1}^{n} k_i \delta(l_i, l) , \tag{12} $$

where $k_i$ is the degree of vertex i. A suitable definition for $G_2$ is

$$ G_2 = \frac{1}{2} \sum_{l=1}^{n} K_l^2 . \tag{13} $$

As with $G_1$, $G_2$ is minimal when all vertices have unique labels and maximal when all vertices have the same label,
working to avoid the trivial global optimum.

We can rewrite $G_2$ in the form

$$ G_2 = \frac{1}{2} \sum_{l=1}^{n} \left( \sum_{v=1}^{n} k_v \delta(l_v, l) \right)^2 = \frac{1}{2} \sum_{u=1}^{n} \sum_{v=1}^{n} k_u k_v \delta(l_u, l_v) . \tag{14} $$

Incorporating $G_2$ into $H'$, we obtain

$$ H' = \frac{1}{2} \sum_{u=1}^{n} \sum_{v=1}^{n} \left( A_{uv} - \lambda k_u k_v \right) \delta(l_u, l_v) . \tag{15} $$

If we select

$$ \lambda = \frac{1}{2m} , \tag{16} $$

where m is the number of edges in the network, the objective function may be written as

$$ H' = mQ . \tag{17} $$

In Eq. (17), Q is the standard modularity measure [6].
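
The step from the penalized objective to the modularity can be made explicit. The short check below is a sketch that assumes the degree-based penalty carries a prefactor of one half and that the weight is chosen as $\lambda = 1/(2m)$; it identifies the label $l_v$ of each vertex with its community and pulls out a factor of m:

```latex
\begin{align*}
H' &= \frac{1}{2} \sum_{u=1}^{n} \sum_{v=1}^{n}
      \left( A_{uv} - \frac{k_u k_v}{2m} \right) \delta(l_u, l_v) \\
   &= m \cdot \frac{1}{2m} \sum_{u=1}^{n} \sum_{v=1}^{n}
      \left( A_{uv} - \frac{k_u k_v}{2m} \right) \delta(l_u, l_v)
    = m Q .
\end{align*}
```

Since m is a constant of the network, any label change that increases $H'$ increases Q by a proportional amount.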

Recalling that the label propagation rule as given by Eq. (5) requires only that a symmetric matrix be used, we
can see from Eq. (15) that modularity can be locally maximized by the label propagation algorithm; we denote
this modularity-specialized algorithm as LPAm. Implementation issues are described in appendix A. We note that
LPAm, due to the effect of $G_2$, is well suited to aggressive optimization, but we do not pursue such optimizations in
the present work.
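
One possible shape of an LPAm update is sketched below in Python. It is not the implementation from the appendix. It assumes the update for a vertex x maximizes the sum over all other vertices u of $(A_{ux} - k_u k_x / 2m)\,\delta(l_u, l)$, evaluated through per-label degree totals so that the cost stays proportional to the degree of x. As a further simplification, only the current label and the labels of neighbors are considered as candidates. The graph is assumed to have at least one edge and no self-loops, and the name `lpam_sweep` is hypothetical.

```python
import random
from collections import defaultdict


def lpam_sweep(adjacency, labels, rng):
    """One asynchronous sweep of modularity-constrained label propagation.

    adjacency -- dict: vertex -> list of neighbors (undirected, no self-loops)
    labels    -- dict: vertex -> label, updated in place
    Returns True if any label changed.
    """
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    two_m = sum(degree.values())                  # 2m = sum of all degrees
    total = defaultdict(int)                      # K_l: summed degree per label
    for v, l in labels.items():
        total[l] += degree[v]

    order = list(adjacency)
    rng.shuffle(order)
    changed = False
    for x in order:
        old = labels[x]
        total[old] -= degree[x]                   # leave x out of its own group
        links = defaultdict(int)                  # edges from x to each label
        for u in adjacency[x]:
            links[labels[u]] += 1
        links.setdefault(old, 0)                  # staying put is always allowed
        # gain(l) = sum over u != x of (A_ux - k_u k_x / 2m) delta(l_u, l)
        gain = {l: links[l] - degree[x] * total[l] / two_m for l in links}
        best = max(gain.values())
        candidates = [l for l in gain if gain[l] >= best - 1e-12]
        new = old if old in candidates else rng.choice(candidates)
        total[new] += degree[x]
        if new != old:
            labels[x] = new
            changed = True
    return changed
```

Because the current label is always among the candidates and is kept on ties, each accepted move strictly increases the penalized objective, so repeated sweeps stop after finitely many changes.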

The penalty term G2 plays the same role as the null model network used to define the modularity (see, e.g., Ref. 0]). 
The idea holds quite generally: various null model networks could be used to define specialized modularity measures, 
or penalty terms could equivalently be introduced into the objective function. This allows the interesting historical 
interpretation that Fu and Anderson [13] made use of a modularity measure for community identification over two
decades ago. 

As a further example, we develop an analogous label propagation algorithm to maximize a recently introduced [14]
version of modularity adapted to the important special class of bipartite networks. The vertices of a bipartite network 
can be partitioned into two disjoint sets such that no two vertices within the same set are adjacent; equivalently, 
the vertices in a bipartite graph can be assigned one of two colors, say red and blue, with no edges present between 
vertices bearing the same color. There are thus two distinct kinds of vertices, providing a natural representation 
for many affiliation or interaction networks, with one kind of vertex representing actors and the other representing 
relations. 

The distinction between the two parts of the network can be incorporated into a modularity measure by defining 
a suitable null network model. In contrast to the standard choice given in Eq. (2), the two kinds of vertices must
be treated separately, with non-zero probability of an edge only between vertices belonging to different parts of the
network. For a red vertex i with degree $k_i$ and a blue vertex j with degree $d_j$, the null model is defined so that

$$ P_{ij} = \frac{k_i d_j}{m} . \tag{18} $$

Using Eq. (18), the bipartite modularity $Q_B$ is

$$ Q_B = \frac{1}{m} \sum_{i} \sum_{j} \left( A_{ij} - \frac{k_i d_j}{m} \right) \delta(g_i, g_j) . \tag{19} $$

The sums in Eq. (19) are to be interpreted as running over the vertices in the two parts of the network, i.e., i is
restricted to run over only the red vertices, while j is restricted to run over only the blue vertices.
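
The restricted sums translate directly into a short computation. The Python sketch below assumes the null model assigns probability $k_i d_j / m$ to an edge between a red vertex i and a blue vertex j, so that $Q_B$ equals the fraction of edges within communities minus the sum over communities of $K D / m^2$, where K and D are the total red and blue degrees of a community. The function name `bipartite_modularity` and the data layout are illustrative assumptions.

```python
def bipartite_modularity(edges, red_labels, blue_labels):
    """Bipartite modularity Q_B with the null model P_ij = k_i d_j / m.

    edges       -- list of (i, j) pairs, i a red vertex and j a blue vertex
    red_labels  -- dict: red vertex -> community
    blue_labels -- dict: blue vertex -> community
    """
    m = len(edges)
    within = 0
    red_total = {}    # summed degree of red vertices per community
    blue_total = {}   # summed degree of blue vertices per community
    for i, j in edges:
        gi, gj = red_labels[i], blue_labels[j]
        if gi == gj:
            within += 1
        red_total[gi] = red_total.get(gi, 0) + 1
        blue_total[gj] = blue_total.get(gj, 0) + 1
    expected = sum(K * blue_total.get(g, 0) for g, K in red_total.items())
    return within / m - expected / (m * m)
```

For a network of two disconnected red-blue pairs, placing each pair in its own community gives $Q_B = 1/2$, while merging everything into one community gives $Q_B = 0$.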

For the present work, it is simpler to allow unrestricted sums over all the vertices. To do this, for each vertex v we
associate two degree measures, a red degree $k_v$ and a blue degree $d_v$. If vertex v is red, we require $d_v = 0$, while if
it is blue we require $k_v = 0$. In either case, the non-zero degree is the number of edges incident on the vertex. With
this construction, Eq. (19) becomes

$$ Q_B = \frac{1}{2m} \sum_{i=1}^{n} \sum_{j=1}^{n} \left( A_{ij} - \frac{k_i d_j + d_i k_j}{m} \right) \delta(g_i, g_j) , \tag{20} $$

where now the sums run over all vertices.

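We now define a penalty term $G_3$ for bipartite networks as

$$ G_3 = \sum_{l=1}^{n} K_l D_l , \tag{21} $$

where

$$ K_l = \sum_{u=1}^{n} k_u \delta(l_u, l) , \tag{22} $$

$$ D_l = \sum_{u=1}^{n} d_u \delta(l_u, l) . \tag{23} $$

Equations (21) through (23) adapt Eqs. (12) and (13) to bipartite networks.
We can rewrite $G_3$ as

$$ G_3 = \sum_{l=1}^{n} \left( \sum_{u=1}^{n} k_u \delta(l_u, l) \right) \left( \sum_{v=1}^{n} d_v \delta(l_v, l) \right) = \sum_{u=1}^{n} \sum_{v=1}^{n} k_u d_v \delta(l_u, l_v) . \tag{24} $$

Writing the full objective function, we have

$$ H' = \frac{1}{2} \sum_{u=1}^{n} \sum_{v=1}^{n} \left( A_{uv} - \lambda \left( k_u d_v + d_u k_v \right) \right) \delta(l_u, l_v) . \tag{25} $$

With

$$ \lambda = \frac{1}{m} , \tag{26} $$

Eq. (25) becomes

$$ H' = m Q_B . \tag{27} $$

The label propagation rule can again be used to maximize the bipartite modularity; we denote the algorithm as LPAb.
Implementation issues for LPAb are treated in appendix B.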


VI. APPLICATIONS AND PERFORMANCE

A. Unipartite networks

We now turn to a comparison of the quality of solutions found by the various label propagation algorithms discussed 
above. To quantify the solution quality, we will focus principally on the modularity Q, although it is not strictly 
optimized except by LPAm. Along with the LPA, LPAr, and LPAm variants discussed above, we will additionally 
consider a hybrid algorithm, consisting of the standard LPA followed by optimization with LPAm. The hybrid 
approach ensures that we are at a maximum in the modularity, rather than just finding a solution that hopefully 
offers a high value of Q. 

To begin, we apply the algorithms to randomly generated networks with a known community structure. The most
typical such class of networks, introduced by Girvan and Newman [15], consists of four communities, each containing
32 vertices. Edges exist between pairs of vertices belonging to the same community with probability $p_{in}$ and between
all other pairs of vertices with probability $p_{out}$. The probabilities $p_{in}$ and $p_{out}$ are set so as to preserve the average
degree $\langle k \rangle$ of the vertices at a value of 16, while varying the average number of edges $z_{out}$ between a vertex and
members of other communities. As $z_{out}$ increases, the communities become increasingly difficult to identify. Although
these model networks differ significantly from real networks with community structure [16], they do provide a simple
initial test of community detection algorithms.
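
A generator for these model networks can be sketched as follows. The edge probabilities are inferred here from the stated constraints rather than quoted from the source: each vertex has 31 possible partners inside its community and 96 outside, so an expected degree of 16 with $z_{out}$ external edges suggests $p_{in} = (16 - z_{out})/31$ and $p_{out} = z_{out}/96$. The function name `planted_partition` and its parameters are hypothetical.

```python
import random


def planted_partition(z_out, rng, groups=4, size=32, mean_degree=16):
    """Random network with planted communities.

    Returns (edges, truth), where truth[v] is the planted community of v.
    """
    n = groups * size
    truth = [v // size for v in range(n)]
    p_in = (mean_degree - z_out) / (size - 1)     # same-community pairs
    p_out = z_out / (n - size)                    # all other pairs
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if truth[u] == truth[v] else p_out
            if rng.random() < p:
                edges.append((u, v))
    return edges, truth
```

Calling `planted_partition(8, random.Random(0))` would produce one instance in which each vertex has, on average, as many links outside its community as inside it.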

In Fig. 2, we show the modularity values for communities found by the four algorithms. Each point shown gives the
average modularity from communities found in 1000 instances of the random network model. As expected, Q drops
as $z_{out}$ increases.

Since we know the actual communities for the model networks, we may additionally assess the accuracy of the label
assignments by directly comparing to the known values. We use the normalized mutual information $I_{norm}$ for the
comparison. Consider two schemes X and Y for dividing the n vertices into community groups. The probability
P(X = x, Y = y) that a vertex is assigned to community x in scheme X and to community y in scheme Y is taken to
be proportional to the size of the intersection between the sets of vertices $C_x$ and $C_y$ constituting the communities,
so that

$$ P(X = x, Y = y) = \frac{|C_x \cap C_y|}{n} . \tag{28} $$

Using the probability as defined in Eq. (28), we can calculate the normalized mutual information as

$$ I_{norm}(X, Y) = \frac{2 I(X, Y)}{H(X) + H(Y)} . \tag{29} $$

Equation (29) is expressed in terms of the usual mutual information I(X, Y) and entropies H(X) and H(Y) [17],
defined as

$$ I(X, Y) = \sum_{x} \sum_{y} P(X, Y) \log \frac{P(X, Y)}{P(X) P(Y)} , \tag{30} $$

$$ H(X) = - \sum_{x} P(X) \log P(X) , \tag{31} $$

$$ H(Y) = - \sum_{y} P(Y) \log P(Y) . \tag{32} $$

In these definitions, we have made use of the common shorthand abbreviations P(X = x, Y = y) = P(X, Y),
P(X = x) = P(X), and P(Y = y) = P(Y). The base of the logarithms in Eqs. (30) through (32) is arbitrary, as
the computed measures only appear in the ratio in Eq. (29).

The normalized mutual information allows us to measure the amount of information common to two different
partitioning schemes. Accordingly, we can explore the efficacy of the algorithm by taking one of the partitions to be
the known modular structure of the model networks and the other to be the structure found using label propagation.
When the found modules match the real ones, we have $I_{norm} = 1$, and when they are independent of the real ones,
we have $I_{norm} = 0$. Thus, as $z_{out}$ increases, we expect $I_{norm}$ to decrease. In Fig. 3, we present values of $I_{norm}$ from
comparison of the real communities to the same community solutions used for the Q calculations in Fig. 2, observing
the expected decrease from $I_{norm} = 1$ to $I_{norm} = 0$.
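
The comparison can be carried out with a few lines of Python. The sketch below assumes the normalized mutual information takes the form 2 I(X, Y) / (H(X) + H(Y)), with joint probabilities given by the fraction of vertices shared between a found community and an actual community. The function name `normalized_mutual_information` and the convention of returning 1 when both partitions consist of a single group are assumptions made here.

```python
from collections import Counter
from math import log


def normalized_mutual_information(found, actual):
    """I_norm = 2 I(X, Y) / (H(X) + H(Y)) for two label sequences."""
    n = len(found)
    joint = Counter(zip(found, actual))           # |C_x intersect C_y| per pair
    px = Counter(found)                           # community sizes in scheme X
    py = Counter(actual)                          # community sizes in scheme Y
    mutual = sum((c / n) * log(c * n / (px[x] * py[y]))
                 for (x, y), c in joint.items())
    hx = -sum((c / n) * log(c / n) for c in px.values())
    hy = -sum((c / n) * log(c / n) for c in py.values())
    if hx + hy == 0:
        return 1.0   # both partitions are a single group (assumed convention)
    return 2 * mutual / (hx + hy)
```

The label values themselves do not matter, only the grouping: two partitions that agree up to a renaming of labels give a value of 1, and statistically independent partitions give 0.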

From Figs. 2 and 3, it is tempting to conclude that LPAm is superior to the other label propagation variants.
However, this conclusion is not borne out when the algorithms are applied to real networks. In Table I, we list several
networks that we have investigated using the label propagation algorithms. The networks considered are a network
of friendships between members of a university karate club [18]; a network of frequent associations between dolphins
living near Doubtful Sound, New Zealand [19]; a network of collaborations between jazz musicians [20]; a network of
co-authorships for scientific papers concerning networks [21]; and a network of co-authorships for scientific preprints
posted to the condensed matter archive [22] between the years 1995 and 2003. We give their sizes in terms of
the number of vertices n and number of edges m. To indicate the degree to which the networks feature community
structures, we also provide the modularity Q, as determined using a greedy agglomerative hierarchical (GAH) method
based on that of Clauset et al. [9], wherein pairs of communities are successively merged so as to cause the largest
possible increase in Q at each step. While edge weights are available in some cases, in this work we uniformly treat
all network edges as unweighted.

[Figure 2: plot of modularity Q (vertical axis) against $z_{out}$ (horizontal axis); legend: LPA (+), LPAm (×), LPAr (*), Hybrid (□).]

FIG. 2: Modularity Q of community solutions from random networks with known community structures. Each point shows
the average Q over 1000 instances of the random networks in relation to the average number $z_{out}$ of inter-community links for
each vertex. The hybrid algorithm consists of allowing the standard LPA to run its course and find a solution, followed by
application of LPAm to the LPA solution in order to ensure that a local maximum of Q is reached. Error bars are smaller than
the points.

For each of the networks, we identify communities using each of the algorithms LPA, LPAm, and LPAr. Additionally,
we consider a hybrid algorithm consisting of LPA followed by LPAm, thus ensuring that we are at a maximum of
Q. We applied each of the four algorithms one hundred times to each of the networks. In Table II, we show the
maximum modularity found in the samples, suggesting the potential performance, while in Table III we show the
mean modularity, revealing the expected performance. From the tables, we can see that no algorithm variant is clearly
superior, suggesting that the four variants all explore slightly different portions of the solution space. Interestingly,
the LPAr variant, which worked poorly when applied to the Southern women network (section IV), provides the best
results on the two large co-authorship networks. We note that the label propagation variants produce community
solutions with modularity values similar to those found with the GAH approach and shown in Table I.

FIG. 3: Accuracy of community solutions from random networks with known community structures. Accuracy is quantified by
the normalized mutual information $I_{norm}$ between the found and actual community solutions. Each point shows the normalized
mutual information $I_{norm}$ over 1000 instances of the random networks in relation to the average number $z_{out}$ of inter-community
links for each vertex. The hybrid algorithm consists of allowing the standard LPA to run its course and find a solution, followed
by application of LPAm to the LPA solution in order to ensure that a local maximum of Q is reached. Error bars are smaller
than the points.
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jazz 

network science 
condmat 2003 



n 
34 
62 
198 
1589 
31163 



m 
78 
159 
2742 
2742 
120029 



Q_ 

0.3807 
0.4923 
0.4389 
0.9555 
0.6885 



TABLE I: Basic properties of networks used to test label propagation algorithm variants. The sizes of the network are 
described by the number of vertices n and number of edges m. Each network has significant modular character, as indicated 
by the modularity Q. 
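The accuracy measure used in Fig. 3 can be computed directly from two label assignments. The sketch below assumes the common normalization $I_{norm} = 2 I(X;Y) / (H(X) + H(Y))$; this excerpt does not spell out the normalization, so that choice, the function name, and the convention for two trivial partitions are illustrative assumptions.

```python
from collections import Counter
from math import log

def normalized_mutual_information(found, actual):
    """Return 2*I(found; actual) / (H(found) + H(actual)) for two label lists.

    The normalization is an assumption of this sketch, not taken from the text.
    """
    if len(found) != len(actual):
        raise ValueError("label lists must have the same length")
    n = len(found)
    count_f = Counter(found)
    count_a = Counter(actual)
    count_joint = Counter(zip(found, actual))

    # Shannon entropies of the two partitions.
    h_f = -sum(c / n * log(c / n) for c in count_f.values())
    h_a = -sum(c / n * log(c / n) for c in count_a.values())

    # Mutual information from the joint label counts.
    mutual = sum(
        c / n * log(c * n / (count_f[f] * count_a[a]))
        for (f, a), c in count_joint.items()
    )

    if h_f + h_a == 0.0:
        # Both partitions place every vertex in one community;
        # treating them as identical is a convention chosen here.
        return 1.0
    return 2.0 * mutual / (h_f + h_a)
```

The measure depends only on how vertices are grouped, not on the label values themselves, so a relabeled copy of the actual solution scores 1 and statistically independent groupings score 0.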



B. Bipartite networks 

As we did above for unipartite networks, we next quantify the quality of community solutions found in bipartite
networks. We measure community quality using the bipartite modularity $Q_B$, calculating values for the LPA, LPAr,
and LPAb variants. Again, we consider a hybrid algorithm, consisting of LPA followed by LPAb, ensuring that the
solutions are at maxima in $Q_B$.

Network            LPA       LPAm      LPAr      Hybrid
karate             0.4156    0.4000    0.4156    0.4198
dolphins           0.5237    0.5157    0.5265    0.5253
jazz               0.4424    0.4448    0.4428    0.4442
network science    0.8924    0.8723    0.9163    0.8934
condmat 2003       0.6228    0.5947    0.6578    0.6360

TABLE II: Maximum modularity Q found for network community assignments. Values were calculated using one hundred
samples for each network for each of the standard LPA, LPAm, LPAr, and a hybrid approach consisting of maximization with
LPA followed by maximization with LPAm.



Network            LPA          LPAm         LPAr         Hybrid
karate             0.366(6)     0.347(3)     0.352(9)     0.386(4)
dolphins           0.484(4)     0.4956(8)    0.484(5)     0.495(3)
jazz               0.336(9)     0.4351(9)    0.34(1)      0.366(7)
network science    0.8792(6)    0.8618(5)    0.9046(5)    0.8806(6)
condmat 2003       0.6073(6)    0.5828(4)    0.6420(6)    0.6139(9)

TABLE III: Mean modularity Q found for network community assignments. Values were calculated using one hundred samples
for each network for each of the standard LPA, LPAm, LPAr, and a hybrid approach consisting of maximization with LPA
followed by maximization with LPAm. The uncertainty of the final digit, calculated as the standard error of the mean, is shown
parenthetically.
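The parenthetical notation of Table III can be reproduced from a list of per-run modularity values. The following is a minimal sketch; the helper name and the choice to round the standard error to one significant digit are assumptions made for illustration, and the helper is intended for standard errors below 1, as is the case for modularity values.

```python
from math import floor, log10, sqrt
from statistics import mean, stdev

def mean_with_sem(samples):
    """Format the sample mean with the standard error of the mean in
    parentheses, e.g. '0.366(6)' for mean 0.366 and standard error 0.006.

    Assumes at least two samples and a standard error below 1.
    """
    m = mean(samples)
    sem = stdev(samples) / sqrt(len(samples))
    if sem == 0.0:
        return repr(m)
    # Number of decimals at which the standard error has one significant digit.
    decimals = -floor(log10(sem))
    digit = round(sem * 10 ** decimals)
    if digit == 10:
        # Rounding pushed the error up a decade, e.g. 0.0096 -> 0.01.
        decimals -= 1
        digit = 1
    decimals = max(decimals, 0)
    return f"{m:.{decimals}f}({digit})"
```

In this notation the maximum over the same samples, as reported in Table II, is simply `max(samples)`.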



We examine the performance using four real-world bipartite networks. The networks are the Southern women
network, described above in section IV; a network describing corporate interlocks in Scotland, based on the membership
of boards of directors for Scottish firms during 1904-5 [23]; and bipartite versions of the condensed matter and network
science co-authorship networks considered in section VI A, including authors and their papers as the two parts of the
network. In Table IV we indicate the size and extent of community structure in the networks. We show the size
using the number of vertices p and q in the two parts of the networks, as well as the number of edges m. We show
the extent of community structure using the bipartite modularity $Q_B$, as determined using a greedy agglomerative
hierarchical method, analogous to that commonly used for unipartite networks [alSl- 

To each network, we apply each label propagation algorithm one hundred times. The maximum and mean values
found for $Q_B$ are given in Tables V and VI, respectively. For the Southern women network, we note that LPAr is
clearly the worst of the algorithms considered, consistent with its tendency to assign the same label to all vertices, as
seen in Fig. 1. Further, the improved performance of LPAb on the Southern women network in terms of the average
$Q_B$ indicates that the inclusion of $G_3$ reduces the frequent appearance of the trivial solution with all vertices in the
same community.

Despite the success of LPAb on the Southern women network, it is less successful on the other networks. Performance
is quite similar for LPA and LPAb on the Scotland corporate interlocks network, but LPAb is otherwise outperformed
by the other label propagation variants. Indeed, LPAr provides the best results for the larger networks, in contrast
to its poor results for the Southern women network. Values of $Q_B$ for community solutions found using the label
propagation variants are generally somewhat less than the values, shown in Table IV, for communities found using a
greedy agglomerative hierarchical approach.

Network                p        q        m         Q_B
Southern women         14       18       89        0.3430
Scotland interlocks    108      136      358       0.6969
network science        959      1588     2580      0.9695
condmat 2003           31162    47055    134600    0.8700

TABLE IV: Basic properties of bipartite networks used to test label propagation algorithm variants. The sizes of the networks
are described by the numbers of vertices p and q in the two parts of the network and by the number of edges m. Each network
has significant modular character, as indicated by the bipartite modularity $Q_B$.

Network                LPA       LPAb      LPAr      Hybrid
Southern women         0.3212    0.3192    0.3184    0.3257
Scotland interlocks    0.5782    0.5783    0.6552    0.5975
network science        0.8137    0.7807    0.8948    0.8172
condmat 2003           0.6378    0.6179    0.7232    0.6587

TABLE V: Maximum bipartite modularity $Q_B$ found for bipartite network community assignments. Values were calculated
using one hundred samples for each network for each of the standard LPA, LPAb, LPAr, and a hybrid approach consisting of
maximization with LPA followed by maximization with LPAb.

Network                LPA          LPAb         LPAr         Hybrid
Southern women         0.19(1)      0.250(3)     0.17(1)      0.27(1)
Scotland interlocks    0.543(1)     0.548(2)     0.633(1)     0.568(1)
network science        0.788(1)     0.7624(6)    0.8733(8)    0.7986(8)
condmat 2003           0.6314(3)    0.6142(1)    0.7183(2)    0.6536(2)

TABLE VI: Mean bipartite modularity $Q_B$ found for bipartite network community assignments. Values were calculated using
one hundred samples for each network for each of the standard LPA, LPAb, LPAr, and a hybrid approach consisting of
maximization with LPA followed by maximization with LPAb.

VII. DISCUSSION 

We have examined the label-propagation algorithm as an optimization problem, identifying community solutions 
that it finds with the maxima of an objective function. The objective function, which is just the number of network 
edges connecting vertices with the same labels, has the significant conceptual drawback that increasing the objective 
function need not produce what we would consider to be better communities. Markedly, the globally optimal solution 
is completely uninformative, with all vertices in the same community. Label propagation thus depends on reaching 
one of the large number of local maxima in the objective function to avoid the trivial global solution. Attempts to 
improve on the algorithm may be counterproductive, giving less information while reaching nominally better solutions. 
By modifying the objective function, we defined several label-propagation algorithms that are constrained to avoid 
assigning all vertices to the same community. One of the constrained label-propagation algorithms, LPAm, finds local 
maxima in the modularity Q; another, LPAb, finds local maxima in a modified modularity $Q_B$ for bipartite networks.

Although the two are formally equivalent, there are important conceptual differences between the usual definition of the
modularity Q in terms of a null model network and the version based on constraints presented here. For example, the
parameter $\lambda$ seems quite arbitrarily chosen in the constraint-based version. In fact, the community solutions found
by LPAm are not especially sensitive to the choice of $\lambda$. The value can, for instance, be cut in half to $\lambda = 1/(4m)$
with significant change only in the case of the mean modularity for Zachary's karate network, where the mean
modularity value actually increases by about 10%.
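To make the role of $\lambda$ concrete, the following sketch evaluates a constrained objective of the form $\frac{1}{2}\sum_{u,v}(A_{uv} - \lambda k_u k_v)\,\delta(l_u,l_v)$ for a given label assignment; with $\lambda = 1/(2m)$, dividing the result by m gives the modularity Q. The function name and the edge-list representation are assumptions for illustration, and the graph is taken to be simple and undirected.

```python
from collections import defaultdict

def constrained_objective(edges, labels, lam):
    """Evaluate (1/2) * sum_uv (A_uv - lam*k_u*k_v) * delta(l_u, l_v).

    edges  -- iterable of (u, v) pairs of a simple undirected graph
    labels -- dict mapping each vertex to its community label
    lam    -- the constraint parameter lambda
    """
    degree = defaultdict(int)
    internal_edges = 0
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if labels[u] == labels[v]:
            internal_edges += 1  # (1/2) * (A_uv + A_vu) for a same-label pair

    # Total degree K_l of the vertices carrying each label.
    volume = defaultdict(int)
    for vertex, k in degree.items():
        volume[labels[vertex]] += k

    # (1/2) * sum_uv k_u k_v delta(l_u, l_v) = (1/2) * sum_l K_l**2
    penalty = 0.5 * sum(K * K for K in volume.values())
    return internal_edges - lam * penalty
```

For two triangles joined by a single edge (m = 7), labeling each triangle as its own community gives an objective of 2.5 at $\lambda = 1/(2m)$, that is, Q = 5/14. Passing `lam = 1 / (4 * m)` instead evaluates the halved-$\lambda$ objective discussed above.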

More significantly, the constraint as given in Eq. (13) makes clear that modularity favors communities of similar size,
with size measured by the total degree of the vertices in the community. As the distribution of community sizes may be 
far from uniform (see, for example, Fig. 3 in Ref. Q), the constraint approach points immediately towards a practical 
difficulty in detecting communities by maximizing modularity. In contrast, difficulties due to varying community sizes
were recognized [24] only some time after the original introduction of modularity using a null model.
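The preference for communities of similar total degree can be seen in one line. The form of $G_2$ written below is an assumption consistent with the LPAm objective used in Appendix A; the defining equation itself lies outside this excerpt.

```latex
G_2 = \frac{1}{2}\sum_{u=1}^{n}\sum_{v=1}^{n} k_u k_v\,\delta(l_u,l_v)
    = \frac{1}{2}\sum_{l} K_l^{2},
\qquad
K_l = \sum_{u=1}^{n} k_u\,\delta(l_u,l),
\qquad
\sum_{l} K_l = 2m .
```

For a fixed number c of communities, $\sum_l K_l^2 \ge (2m)^2/c$, with equality only when all $K_l$ are equal, so the penalty is smallest when the communities have equal total degree.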

Corresponding properties hold in the case of the bipartite modularity $Q_B$. Again, $\lambda$ seems arbitrarily chosen;
halving the parameter value to $\lambda = 1/m$ again only causes a significant change for the small Southern women
network, increasing the mean bipartite modularity found by about 10%. In the bipartite case, communities of similar
size are also favored, but the relevant size is now the geometric mean of the total degrees within the community for
the two parts of the network, as seen in Eq. (21). We thus expect that community identification methods based on
maximizing $Q_B$ will also have difficulties with networks consisting of communities of diverse sizes. Although this latter
fact has been anticipated [14] based on parallels to the unipartite case, it has not been previously demonstrated.

In light of the results for the real-world networks (Tables II and III for unipartite networks, Tables V and VI for
bipartite networks), it seems clear that the main label propagation variants we have considered (LPA, LPAm, LPAr,
and LPAb) all give good community results. The performance differences indicate that the algorithm variants explore
slightly different portions of the community solution space. No variant is clearly superior, which is not surprising 
given that we are trying to identify communities without prior information on their number, size, or nature. 

When compared to the modularity values for community solutions generated by greedy agglomerative hierarchical 
methods, the label propagation variants appear to provide no advantage or, in the case of bipartite networks, to entail 
a distinct disadvantage. We stress that the difference in modularity values should not be overvalued, for two main 
reasons. First, the modularity measure, while popular, is not the only possibility, nor is it without drawbacks (see, 
e.g., Ref. [25]). Second, the algorithms are quite different, so no single point of comparison will be determinative 
in general. A more thorough characterization of performance is needed to establish reliable guidelines for choosing 
appropriate algorithms to analyze particular networks; this will be the subject of future work. 

The performance of LPAm is especially interesting: although it is the only variant directly maximizing Q, other 
variants produce better results in terms of Q for some of the networks considered. This appears to be due to a 
fundamental difference in the role played by the modularity in the algorithm variants. Lacking an objective function, 
Raghavan et al. [4] used the modularity of the final community solution to assess the acceptability of their LPA, as
did we when assessing LPAr in the present work. Thus, in LPA and LPAr, the modularity is used diagnostically to 
select a best result from candidate solutions produced based on other considerations. In contrast, the modularity 
plays an essential role in LPAm, impacting the final community solution as well as the intermediate community states 
reached during the course of the algorithm. The dynamical path followed through the space of label assignments is 
driven to favor states where all communities are similar in total degree, although there is little reason to believe such 
paths are universally ideal or particularly free of local maxima. Thus, the null model network used in defining the 
modularity — regardless of its suitability as a model of the final communities — may be an impractical model of the 
intermediate communities. This might be addressed by varying G, gradually introducing the penalty term $G_2$ and
thus the null network model. Similar considerations hold for LPAb and the corresponding null network models for
bipartite networks. 

Overall, we have found the label propagation algorithm to be a promising approach to understanding networks, 
with a number of desirable qualities. Label propagation seems well suited as a basis for more specialized
community-detection methods, as well as application to other aspects of networks besides community structure. A clear
understanding of the drawbacks of label propagation, as well as its strengths, will help to avoid problems and facilitate
further applications.

APPENDIX A: A LABEL-PROPAGATION ALGORITHM FOR MAXIMIZING MODULARITY

The label propagation algorithm presented by Raghavan et al. [4] has desirable performance properties. Each
relabeling iteration through the vertices has a computational (time) complexity O(m) linear in the number of edges
m in the graph. For many networks, the number of edges scales linearly with the number of vertices n, so the computational
complexity for each relabeling step can instead be given as O(n).
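As an illustration of why one relabeling pass is O(m), here is a minimal sketch of a single asynchronous pass of the standard LPA. The adjacency-list representation, the function name, and the tie-breaking rule (keep the current label when it is among the most frequent, otherwise choose uniformly at random) are assumptions made for this sketch rather than details taken from the text.

```python
import random
from collections import Counter

def lpa_pass(adjacency, labels, rng=random):
    """Relabel every vertex once; return True if any label changed.

    adjacency -- dict: vertex -> list of neighboring vertices
    labels    -- dict: vertex -> current label (modified in place)
    rng       -- source of randomness (the random module or a random.Random)
    """
    changed = False
    order = list(adjacency)
    rng.shuffle(order)
    for v in order:
        if not adjacency[v]:
            continue  # isolated vertices keep their label
        # Counting neighbor labels touches each edge end once: O(m) per pass.
        counts = Counter(labels[u] for u in adjacency[v])
        best = max(counts.values())
        candidates = [l for l, c in counts.items() if c == best]
        if labels[v] in candidates:
            continue  # current label is already a most frequent label
        labels[v] = rng.choice(candidates)
        changed = True
    return changed
```

Starting from unique labels, `labels = {v: v for v in adjacency}`, the pass is repeated until it returns `False`, at which point the label assignment is stable.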

As seen in section V, the objective function for the LPA can be constrained to reproduce the modularity. Consequently,
it is necessary to adapt the algorithm itself to obtain an efficient procedure for maximizing the modularity.
Modifications can be made so as to maintain the O(m) time complexity. Here, we consider the constraint $G_2$ given
in Eq. (14), i.e., we implement LPAm.

First, consider the objective function from Eq. (7). Recall that the LPA update rule (Eq. (5)) can be applied
with any symmetric matrix $B_{uv}$ playing the role of the adjacency matrix $A_{uv}$ (see section III). Further, it is clear
that the objective function may be shifted by adding an arbitrary constant C without altering the locations of the
maxima in the space of label assignments. By setting $C = -\sum_{u=1}^{n} B_{uu}$ we eliminate the diagonal elements $B_{uu}$ from
consideration, producing an objective function



$$ H' = \sum_{v=1}^{n} \sum_{u \neq v} B_{uv}\,\delta(l_u,l_v) \tag{A1} $$



and update rule 




$$ l'_v = \operatorname*{argmax}_{l} \sum_{u \neq v} B_{uv}\,\delta(l_u,l) . \tag{A2} $$



The above transformation eliminates constant self-interaction terms.
Next, identify $B_{uv}$ as $A_{uv} - \lambda k_u k_v$ to match the LPAm variant, giving




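$$ l'_v = \operatorname*{argmax}_{l} \sum_{u \neq v} \left( A_{uv} - \lambda k_u k_v \right) \delta(l_u,l) \tag{A3} $$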



or, equivalently,

$$ l'_v = \operatorname*{argmax}_{l} \left( \sum_{u \neq v} A_{uv}\,\delta(l_u,l) - \lambda k_v \sum_{u \neq v} k_u\,\delta(l_u,l) \right) . \tag{A4} $$

The first sum in Eq. (A4) corresponds to the counting of labels on neighboring vertices in the original label propagation
algorithm. Write this as

$$ N_{vl} = \sum_{u \neq v} A_{uv}\,\delta(l_u,l) . \tag{A5} $$

The second sum in Eq. (A4) can be rewritten as

$$ \sum_{u \neq v} k_u\,\delta(l_u,l) = K_l - k_v\,\delta(l_v,l) , \tag{A6} $$

where

$$ K_l = \sum_{u=1}^{n} k_u\,\delta(l_u,l) . \tag{A7} $$

Analogously to the volume of a graph, $K_l$ can be viewed as a sort of volume for the labels.
Incorporating Eqs. (A5) and (A7) into Eq. (A4), we obtain

$$ l'_v = \operatorname*{argmax}_{l} \left( N_{vl} - \lambda k_v K_l + \lambda k_v^2\,\delta(l_v,l) \right) . \tag{A8} $$
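The per-label score in Eq. (A8) can be checked numerically against the direct sum over $u \neq v$ of $(A_{uv} - \lambda k_u k_v)\,\delta(l_u,l)$. The sketch below does so for a small hypothetical graph; the graph, the label assignment, and the function names are illustrative assumptions.

```python
def direct_score(A, k, labels, lam, v, l):
    """Sum over u != v of (A[u][v] - lam*k[u]*k[v]) * delta(labels[u], l)."""
    return sum(
        A[u][v] - lam * k[u] * k[v]
        for u in range(len(A))
        if u != v and labels[u] == l
    )

def a8_score(A, k, labels, lam, v, l):
    """N_vl - lam*k_v*K_l + lam*k_v**2 * delta(l_v, l), as in Eq. (A8)."""
    n = len(A)
    # A[v][v] = 0 (no self-loops), so including u = v leaves N_vl unchanged.
    N_vl = sum(A[u][v] for u in range(n) if labels[u] == l)
    K_l = sum(k[u] for u in range(n) if labels[u] == l)  # volume of label l
    same = 1 if labels[v] == l else 0
    return N_vl - lam * k[v] * K_l + lam * k[v] ** 2 * same

# Hypothetical example: the 4-cycle 0-1-2-3-0 with the chord 0-2.
A = [[0, 1, 1, 1],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [1, 0, 1, 0]]
k = [sum(row) for row in A]   # vertex degrees
lam = 1.0 / sum(k)            # lambda = 1/(2m), since sum(k) = 2m
labels = [0, 0, 1, 1]
```

The two functions agree for every vertex and label in this example; the second form is the one that lends itself to incremental updates, since $K_l$ can be maintained as labels change.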

The modified label propagation rule, as expressed in Eq. (A8), can be readily implemented so that each pass through
the vertices requires O(m) worst-case time complexity.

The algorithm is initialized by assigning a unique numerical label l to each vertex and by setting $K_l$ to the degree
of the vertex. The first term, $N_{vl}$, requires that the labels of the neighbors for each vertex be counted and is thus
O(m); this is unsurprising as it is equivalent to the unmodified label propagation algorithm, which is O(m). The
second term appears to require that each possible label be checked for each vertex, giving O(n^2). However, it is only
necessary to consider the labels of the neighbors for each vertex: no other label can make a positive contribution to
the modularity, but a zero contribution can be had by assigning an unused label. A list of unused labels can be kept,
allowing O(1) access. Additionally, the $K_l$ must be updated if the label changes, but this is also O(1) for each vertex.
In total, checking and updating the $K_l$ terms for all vertices is O(m). The final term in Eq. (A8) is O(n) in total.
With all three terms taken into account, the modified algorithm thus has worst-case O(m) time complexity for each pass.
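The bookkeeping described above can be sketched as follows. This is one possible arrangement under stated assumptions, not the authors' implementation: the function name, the adjacency-list input, the random visiting order, the rule of keeping the current label on ties, the floating-point tolerance, and the `max_passes` cap are all choices made for illustration, with $\lambda = 1/(2m)$.

```python
import random

def lpam(adjacency, rng=random, max_passes=1000):
    """Label propagation with the Eq. (A8) update rule (a sketch of LPAm).

    adjacency -- dict: vertex -> list of neighbors (simple undirected graph)
    Returns a dict: vertex -> label.
    """
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    two_m = sum(degree.values())
    lam = 1.0 / two_m if two_m else 0.0    # lambda = 1/(2m)

    labels = {v: v for v in adjacency}     # unique initial labels
    volume = dict(degree)                  # K_l starts as the vertex degree
    size = {v: 1 for v in adjacency}       # number of vertices per label
    unused = []                            # labels carried by no vertex

    for _ in range(max_passes):
        changed = False
        order = list(adjacency)
        rng.shuffle(order)
        for v in order:
            k_v, old = degree[v], labels[v]
            # N_vl for the labels seen on neighbors: O(k_v) work.
            counts = {old: 0}
            for u in adjacency[v]:
                counts[labels[u]] = counts.get(labels[u], 0) + 1
            if unused:
                counts[unused[-1]] = 0     # an empty label scores exactly zero

            def score(l):
                own = lam * k_v * k_v if l == old else 0.0
                return counts[l] - lam * k_v * volume[l] + own

            best = max(score(l) for l in counts)
            if score(old) >= best - 1e-12:
                continue                   # keep the current label on ties
            new = rng.choice([l for l in counts if score(l) >= best - 1e-12])
            if unused and new == unused[-1]:
                unused.pop()
            # O(1) bookkeeping of K_l and of the unused-label list.
            volume[old] -= k_v
            volume[new] += k_v
            size[old] -= 1
            size[new] += 1
            if size[old] == 0:
                unused.append(old)
            labels[v] = new
            changed = True
        if not changed:
            break
    return labels
```

Each vertex inspects only its neighbors' labels plus at most one unused label, so a full pass costs O(m). Every accepted move strictly increases the objective, which is why the loop reaches a stable assignment.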



APPENDIX B: A LABEL-PROPAGATION ALGORITHM FOR MAXIMIZING BIPARTITE MODULARITY



In Eq. (25), we have presented an objective function corresponding to the bipartite modularity $Q_B$, with form

$$ H' = \frac{1}{2}\sum_{u=1}^{n}\sum_{v=1}^{n}\left(A_{uv} - \lambda k_u d_v\right)\delta(l_u,l_v) . \tag{B1} $$



We cannot directly apply the label propagation update rule from Eq. (5), as $A_{uv} - \lambda k_u d_v$ is in general asymmetric.
Despite this, we can define a label propagation rule for $H'$.

We rewrite Eq. (25) by first taking advantage of the symmetry of $A_{uv}$ and $\delta(l_u,l_v)$, giving

$$ H' = \frac{1}{2}\sum_{u=1}^{n}\sum_{v=1}^{n}\left(A_{vu} - \lambda k_u d_v\right)\delta(l_v,l_u) . \tag{B2} $$

Next, we switch the dummy indices u and v, resulting in

$$ H' = \frac{1}{2}\sum_{u=1}^{n}\sum_{v=1}^{n}\left(A_{uv} - \lambda k_v d_u\right)\delta(l_u,l_v) . \tag{B3} $$

Averaging Eqs. (B1) and (B3), we obtain

$$ H' = \frac{1}{2}\sum_{u=1}^{n}\sum_{v=1}^{n}\left(A_{uv} - \frac{\lambda}{2}\left(k_u d_v + k_v d_u\right)\right)\delta(l_u,l_v) , \tag{B4} $$

which is in terms of a symmetric matrix and thus suitable for use with Eq. (5).

The objective function, as expressed in Eq. (B4), can be converted into the LPAb label propagation rule for bipartite
modularity in a fashion directly parallel to that presented in appendix A. The resulting update rule has the form

$$ l'_v = \operatorname*{argmax}_{l} \left( N_{vl} - \frac{\lambda}{2} k_v D_l - \frac{\lambda}{2} d_v K_l + \lambda k_v d_v\,\delta(l_v,l) \right) , \tag{B5} $$

where

$$ K_l = \sum_{u=1}^{n} k_u\,\delta(l_u,l) , \tag{B6} $$

$$ D_l = \sum_{u=1}^{n} d_u\,\delta(l_u,l) . \tag{B7} $$

By updating $K_l$ and $D_l$ when labels change, the algorithm can be implemented efficiently. The details, omitted here,
are similar to those given in appendix A and result in the same O(m) worst-case time complexity for each iteration
of LPAb.
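The bookkeeping with $K_l$ and $D_l$ can be checked against the symmetric matrix of Eq. (B4). In the sketch below, $k_u$ is taken to be the degree for vertices in one part and zero otherwise, and $d_u$ the degree for vertices in the other part and zero otherwise; that convention, the value $\lambda = 2/m$ (inferred from the halved value 1/m quoted in the discussion), the small example network, and the function names are assumptions for illustration.

```python
def direct_score(A, k, d, labels, lam, v, l):
    """Sum over u != v of B_uv * delta(l_u, l), with the symmetric
    B_uv = A_uv - (lam/2) * (k_u*d_v + k_v*d_u) from Eq. (B4)."""
    return sum(
        A[u][v] - 0.5 * lam * (k[u] * d[v] + k[v] * d[u])
        for u in range(len(A))
        if u != v and labels[u] == l
    )

def volume_score(A, k, d, labels, lam, v, l):
    """The same quantity computed from N_vl, K_l, and D_l."""
    members = [u for u in range(len(A)) if labels[u] == l]
    N_vl = sum(A[u][v] for u in members)   # A[v][v] = 0, so u = v adds nothing
    K_l = sum(k[u] for u in members)
    D_l = sum(d[u] for u in members)
    same = 1 if labels[v] == l else 0
    return (N_vl - 0.5 * lam * (k[v] * D_l + d[v] * K_l)
            + lam * k[v] * d[v] * same)

# Hypothetical bipartite example: vertices 0-1 form one part, 2-4 the other,
# with edges 0-2, 0-3, 1-3, 1-4.
A = [[0, 0, 1, 1, 0],
     [0, 0, 0, 1, 1],
     [1, 0, 0, 0, 0],
     [1, 1, 0, 0, 0],
     [0, 1, 0, 0, 0]]
k = [2, 2, 0, 0, 0]   # degrees of the first part, zero elsewhere
d = [0, 0, 1, 2, 1]   # degrees of the second part, zero elsewhere
m = 4
lam = 2.0 / m
labels = [0, 1, 0, 0, 1]
```

Because only sums over the members of each label enter the second form, a label change for one vertex requires adjusting just the $K_l$ and $D_l$ of its old and new labels.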



